Abstract. Given a convergence theorem in analysis, a model-theoretic compactness
argument can often be used to show that there is a uniform bound on the rate of
metastability. We illustrate this with three examples from ergodic theory.



1. Introduction 

Convergence theorems in analysis are often disappointingly nonuniform. For 
example, Krengel [27] has shown, roughly speaking, that even if one fixes an ergodic
measure preserving system, the convergence of averages guaranteed by the mean 
ergodic theorem can be arbitrarily slow. The goal of this note is to show that 
even in such cases, a straightforward compactness argument can often be used to 
establish a weaker uniformity, namely, the existence of uniform bounds on the rate 
of metastable convergence. 

If $(a_n)_{n \in \mathbb{N}}$ is a sequence of elements in a metric space $(X, d)$, saying that
$(a_n)$ is Cauchy is equivalent to saying that for every $\varepsilon > 0$ and function
$F : \mathbb{N} \to \mathbb{N}$, there is an $n$ such that $d(a_i, a_j) < \varepsilon$ for every
$i, j \in [n, F(n)]$. Think of $F$ as trying to disprove the convergence of $(a_n)$ by finding
intervals where the sequence fluctuates by more than $\varepsilon$; the $n$ asserted to exist foils
$F$ in the sense that the sequence remains $\varepsilon$-stable on $[n, F(n)]$. We will call a bound
on such an $n$, depending on $F$ and $\varepsilon$, a bound on the rate of metastability.
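The equivalence with the Cauchy property can be checked directly. The following is a sketch of the standard argument; the particular choice of $F$ in the second half is ours, made for illustration.

```latex
% Forward direction: Cauchy implies metastable.
Suppose $(a_n)$ is Cauchy, and fix $\varepsilon > 0$ and $F : \mathbb{N} \to \mathbb{N}$.
Choose $n$ such that $d(a_i, a_j) < \varepsilon$ for all $i, j \ge n$.
Then in particular $d(a_i, a_j) < \varepsilon$ for all $i, j \in [n, F(n)]$.

% Converse, by contraposition.
Suppose $(a_n)$ is not Cauchy. Then there is an $\varepsilon > 0$ such that
\[
  \forall n \; \exists i, j \ge n \quad d(a_i, a_j) \ge \varepsilon .
\]
Define
\[
  F(n) = \min \{\, m \ge n \mid \exists i, j \in [n, m] \;\; d(a_i, a_j) \ge \varepsilon \,\} .
\]
Then for every $n$ there are $i, j \in [n, F(n)]$ with $d(a_i, a_j) \ge \varepsilon$,
so no $n$ witnesses metastability for this $\varepsilon$ and $F$.
```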

The arguments below show that, in many convergence theorems, there is a bound 
on the rate of metastability that depends on only a few of the relevant parameters. 
All that is required is that the class of structures in question, and the hypotheses of
the theorem, are preserved under a certain model-theoretic ultraproduct construction
in which these parameters remain fixed. This requirement can be formulated
in syntactic terms, by asserting that the relevant hypotheses and axioms can
be put in a certain logical form. We spell this out in Section 2, which summarizes
the necessary background on ultraproducts in analysis. Section 3 illustrates the
method with three examples from ergodic theory.

Metastability has proved useful in ergodic theory and ergodic Ramsey theory
[HmHH]; see also @3J Sections 1.3-1.4], and gj [3J [Ml EH 123 EHE |2B] for various 
instances of metastability in analysis. Tao [40] relates the existence of uniform 
bounds on the rate of metastable convergence of a collection of sequences to non- 
standard convergence statements in a corresponding ultraproduct. 

Sometimes stronger uniformities are available than the ones we consider here, in 
the form of square-function inequalities (for example, as in Jones, Ostrovskii, and 



2010 Mathematics Subject Classification. 46B08, 03C20, 37A30. 

Avigad's work has been partially supported by NSF grant DMS- 1068829 and AFOSR grant 
FA9550-12-1-0370. 






JEREMY AVIGAD AND JOSE IOVINO 



Rosenblatt [19]), or bounds on the number of fluctuations; see the discussion in
[5]. Bergelson et al. [8] explore aspects of uniformity in ergodic theory and ergodic
Ramsey theory; most of the methods there rely on specific combinatorial features 
of the phenomena at hand. 

The methods developed here complement proof-theoretic methods developed by
Kohlenbach and collaborators, e.g. in [21, 12]. Roughly, those methods provide
"metatheorems" which show that when a statement with a certain logical form
is derivable in a certain (fairly expressive) axiomatic theory, certain uniformities
always obtain. The arguments we present here replace derivability in an axiomatic
system with closure under the formation of ultraproducts. Indeed, it seems likely
that such arguments can be used to establish general metatheorems like the ones
in [21, 12], by considering ultraproducts of models of the axiomatic theories in
question.

It is worth noting that although the methods we describe here can be used to 
establish the existence of a very uniform bound, they give no explicit quantitative 
information at all, nor even show that it is possible to compute such a bound as
a function of $F$ and $\varepsilon$. In contrast, the proof-theoretic techniques provide ways
that such information can be "mined" from a specific proof. If one is primarily 
interested in uniformity, however, the methods here have the virtue of being easy 
to understand and apply. 



2. Ultraproducts in analysis

In this section we review standard ultraproduct constructions in analysis; see
[H US El EE] for more details.

Let $I$ be any infinite set, and let $D$ be a nonprincipal ultrafilter on $I$. (Below,
we will always take $I$ to be $\mathbb{N}$.) Any bounded sequence $(r_i)_{i \in I}$ of real
numbers has a unique limit with respect to $D$, $r = \lim_{i,D} r_i$; this means that for
every $\varepsilon > 0$ the set $\{ i \in I \mid |r_i - r| < \varepsilon \}$ is in $D$. Suppose
that for each $i$, $(X_i, d_i)$ is a metric space with a distinguished point $a_i$. Let

$$X_\infty = \Bigl\{ (x_i) \in \prod_{i \in I} X_i \Bigm| \sup_{i} d_i(x_i, a_i) < \infty \Bigr\} \Big/ {\sim},$$

where $(x_i) \sim (y_i)$ if and only if $\lim_{i,D} d_i(x_i, y_i) = 0$. Let $d_\infty$ be the
metric on $X_\infty$ defined by $d_\infty((x_i), (y_i)) = \lim_{i,D} d_i(x_i, y_i)$. Leaving the
dependence on the choice of the base points implicit, we will call this an ultraproduct
of the metric spaces $(X_i, d_i)$, denoted by $(\prod_i (X_i, d_i))_D$. If there is a uniform
bound on the diameters of these spaces, the choice of the sequence $(a_i)$ of "anchor
points" is clearly irrelevant.
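Existence and uniqueness of the limit along $D$ follow from a bisection argument. The sketch below is the standard one and is included only to make the definition concrete; the interval names $[a_m, b_m]$ are ours.

```latex
Let $(r_i)_{i \in I}$ take values in $[a_0, b_0]$, so that
$\{ i \in I \mid r_i \in [a_0, b_0] \} = I \in D$.
Given $[a_m, b_m]$ with $\{ i \mid r_i \in [a_m, b_m] \} \in D$, let $c_m = (a_m + b_m)/2$.
Since $D$ is an ultrafilter and
\[
  \{ i \mid r_i \in [a_m, b_m] \}
  = \{ i \mid r_i \in [a_m, c_m] \} \cup \{ i \mid r_i \in [c_m, b_m] \},
\]
at least one of the two sets on the right is in $D$; let $[a_{m+1}, b_{m+1}]$ be
the corresponding half. The nested intervals have lengths $(b_0 - a_0)/2^m$ and
shrink to a single point $r$. Given $\varepsilon > 0$, choose $m$ with $b_m - a_m < \varepsilon$; then
\[
  \{ i \mid |r_i - r| < \varepsilon \} \supseteq \{ i \mid r_i \in [a_m, b_m] \} \in D .
\]
For uniqueness, if $r \ne r'$ were both limits, then with $\varepsilon = |r - r'|/2$ the sets
$\{ i \mid |r_i - r| < \varepsilon \}$ and $\{ i \mid |r_i - r'| < \varepsilon \}$ would be disjoint
elements of $D$, which is impossible.
```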

This ultraproduct construction is an instance of Luxemburg's nonstandard hull 
construction [33]. We can extend it to ultraproducts of a sequence $(X_i)$ of normed
spaces using $a_i = 0$ and the distance given by the norm. Ultraproducts of Banach
spaces were introduced by Dacunha-Castelle and Krivine [10], and are an important
tool in a number of branches of analysis (see e.g. [16]).

In first-order model theory, one can take an ultraproduct of any sequence of
structures $M_i$, and Łoś's theorem says that any first-order sentence $\varphi$ is true in the
ultraproduct if and only if it is true in almost every $M_i$, in the sense of $D$; in other
words, if and only if $\{ i \mid M_i \models \varphi \} \in D$. The constructions above, however, are not
ultraproducts in the first-order sense, since we restrict to "finite" elements, mod
out by infinitesimal proximity $\sim$, and (implicitly, by taking limits with respect to
$D$) pass to the standard part of nonstandard distances and norms. This gives rise
to two complications.

First, if we extend the metric or normed spaces with other functions, their lifting
to the ultraproduct will not be well defined if they fail to map finite elements
to finite elements, or fail to respect $\sim$. We can lift, however, any family $(f_i)$
of functions that satisfies an appropriate uniform boundedness condition (roughly,
elements of the family are uniformly bounded on bounded sets around the base
point) and an appropriate uniform continuity condition (which is to say that there
is a uniform modulus of uniform continuity on such sets). The resulting function
on the ultraproduct will be denoted $(\prod_i f_i)_D$. For details, see [16, Section 4] or
[7, Section 4].

Second, Łoś's theorem needs to be modified. One strategy, described in [16], is
to restrict attention to a class of positively bounded formulas. These are formulas
generated from atomic formulas $r \le t$ and $t \le r$, where $t$ is an appropriate term and
$r$ is rational, using only the positive connectives $\wedge$ and $\vee$, as well as universal and
existential quantification over compact balls in the structure. An approximation to
such a formula is obtained by replacing each $r$ in an atomic formula $r \le t$ by any
$r' < r$, and each $r$ in an atomic formula $t \le r$ by any $r' > r$. Say that a formula $\varphi$
with parameters is approximately true in a structure if every approximation $\varphi'$ to
$\varphi$ is true in the structure. One can then show that if $a_1, \ldots, a_n$ are elements of the
ultraproduct with each $a_j$ represented by the sequence $(a_{j,i})_{i \in I}$, then a positively
bounded formula $\varphi(a_1, \ldots, a_n)$ is approximately true in the ultraproduct
$(\prod_i M_i)_D$ if and only if

$$\{ i \mid M_i \models \varphi'(a_{1,i}, \ldots, a_{n,i}) \} \in D$$

for every approximation $\varphi'$ to $\varphi$.

Suppose $\Gamma$ is a set of positively bounded sentences, and $\mathcal{C}$ is the class of structures
$\mathcal{M}$ that approximately satisfy each sentence in $\Gamma$. The previous equivalence implies
that $\mathcal{C}$ is closed under ultraproducts. In fact, Henson and Iovino [16, Proposition
13.6] show that a class of structures $\mathcal{C}$ can be axiomatized in this way if and only
if $\mathcal{C}$ is closed under isomorphisms, ultraproducts, and ultraroots.

Another strategy, described in [7], is to modify first-order semantics so that formulas
take on truth values in a bounded interval of reals, in which case the truth
value of a formula $\varphi$ in the ultraproduct is the $D$-limit of its truth values in the
individual structures. Spelling out the details here would take us too far afield. Below
we will only use the fact that certain classes of structures and hypotheses are
preserved under ultraproducts, as well as the easy fact that a quantifier-free positively
bounded formula $\varphi$ is true in a structure if and only if every approximation
to it is true, thereby simplifying the equivalence above.
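To see why approximate truth is the right notion once quantifiers are present, consider the following toy example. It is only a sketch: it glosses over the precise syntax of [16] and over completeness requirements, and the structure $(X, d, T)$ is a hypothetical one chosen for illustration.

```latex
Let $X = (0, 1]$ with the usual metric $d$, and let $T(x) = x/2$. Consider the sentence
\[
  \varphi \;\equiv\; \exists x \, \bigl( d(Tx, x) \le 0 \bigr),
\]
whose approximations are the sentences
\[
  \varphi' \;\equiv\; \exists x \, \bigl( d(Tx, x) \le r' \bigr), \qquad r' > 0 .
\]
Since $d(Tx, x) = x/2 > 0$ for every $x \in X$, the sentence $\varphi$ is false in
$(X, d, T)$. But each $\varphi'$ is true, witnessed by any $x \le 2r'$, so $\varphi$ is
approximately true. In an ultrapower of $(X, d, T)$ by a nonprincipal ultrafilter
on $\mathbb{N}$, the element represented by $(1/k)_{k \ge 1}$ is an actual fixed point of
the lifted map, so there $\varphi$ is true.
```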

3. Examples 

Let $T$ be any nonexpansive operator on a Hilbert space $\mathcal{H}$, let $f$ be any element
of $\mathcal{H}$, and for each $N \ge 1$ let $A_N f$ denote the ergodic average
$\frac{1}{N} \sum_{n < N} T^n f$.
Riesz's generalization of von Neumann's mean ergodic theorem states that the
sequence $(A_N f)$ of averages converges in the Hilbert space norm. The following
generalization is due to Lorch [32], but also a consequence of results of Riesz [37],
Yosida [49], and Kakutani [20] from around the same time (see [28, p. 73]). A
linear operator $T$ on a Banach space $B$ is power bounded if there is an $M$ such that
$\|T^n\| \le M$ for every $n$.

Theorem 3.1. If $T$ is any power-bounded linear operator on a reflexive Banach
space $B$, and $f$ is any element of $B$, then the sequence $(A_n f)_{n \in \mathbb{N}}$ converges.

As noted in Section 1, even in the original von Neumann setting there is no
uniform bound on the rate of convergence. Moreover, a rate of convergence is
generally not computable from the given data [51 US]; see also the discussion in
[3, Section 5]. However, we can obtain a strong uniformity if we shift attention to
metastability.
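To make the shift to metastability concrete, the following Python sketch computes ergodic averages for a rotation of the complex plane and searches for a point $n$ at which the averages are $\varepsilon$-stable on $[n, F(n)]$. The rotation angle, the function $F$, the value of $\varepsilon$, and all function names are hypothetical choices made for illustration; nothing here is taken from the arguments of this note.

```python
import cmath

def ergodic_average(theta, f, N):
    """Return A_N f = (1/N) * sum_{n<N} T^n f, where T multiplies by e^{i*theta}."""
    rotation = cmath.exp(1j * theta)
    total, current = 0j, f
    for _ in range(N):
        total += current      # add T^n f
        current *= rotation   # advance to T^(n+1) f
    return total / N

def metastable_point(seq, F, eps, search_limit):
    """Return the least n in [1, search_limit] with |seq(i) - seq(j)| < eps
    for all i, j in [n, F(n)], or None if the search finds no such n."""
    for n in range(1, search_limit + 1):
        values = [seq(i) for i in range(n, F(n) + 1)]
        if all(abs(a - b) < eps for a in values for b in values):
            return n
    return None

theta = 1.0                                   # hypothetical rotation angle
averages = lambda N: ergodic_average(theta, 1 + 0j, N)
F = lambda n: 2 * n + 10                      # hypothetical "adversary" function
n_star = metastable_point(averages, F, eps=0.1, search_limit=100)
print(n_star)
```

For this particular $T$ the averages satisfy $|A_N f| \le 2 / (N |1 - e^{i\theta}|)$, so the search is guaranteed to succeed quickly; in general no such explicit rate is available, which is the point of the discussion above.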

Theorem 3.2. Let $\mathcal{C}$ be any class of Banach spaces with the property that the
ultraproduct of any countable collection of elements of $\mathcal{C}$ is a reflexive Banach
space. For every $\rho > 0$, $M$, and function $F : \mathbb{N} \to \mathbb{N}$, there is $K$ such that the
following holds: given any Banach space $B$ in $\mathcal{C}$, any linear operator $T$ on $B$ satisfying
$\|T^n\| \le M$ for every $n$, any $f \in B$, and any $\varepsilon > 0$, if $\|f\| / \varepsilon \le \rho$,
then there is an $n \le K$ such that $\|A_i f - A_j f\| < \varepsilon$ for every $i, j \in [n, F(n)]$.

Proof. Scaling, we can restrict attention to elements $f$ such that $\|f\| \le 1$. Fix
$\mathcal{C}$, and suppose the claim is false for $\rho = 1/\varepsilon$ and $F$. For each $k$ in $\mathbb{N}$, choose a
counterexample, that is, a Banach space $B_k \in \mathcal{C}$, a linear operator $T_k$ such that
$\|T_k^n\| \le M$ for every $n$, and an element $f_k$ such that $\|f_k\| \le 1$ and for every $n \le k$
there are $i, j \in [n, F(n)]$ such that $\|A_i f_k - A_j f_k\| \ge \varepsilon$.

The fact that $\|T_k\| \le M$ for every $k$ guarantees that the family $(T_k)$ satisfies
the uniform boundedness and uniform continuity conditions. Let $D$ be a
nonprincipal ultrafilter on $\mathbb{N}$, and let $B = (\prod_k B_k)_D$ be the Banach space
ultraproduct, with $T = (\prod_k T_k)_D$ and $f = (\prod_k f_k)_D$. By hypothesis, $B$ is reflexive. We
have $\|T^n\| \le M$ for every $n$, since this is true of each $T_k$. Moreover, for every $n$,
$\|A_i f - A_j f\| \ge \varepsilon$ for some $i, j \in [n, F(n)]$, since this is true of the elements $f_k$ in
all but finitely many of the structures $B_k$. This contradicts Theorem 3.1. □
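The last step of the proof uses a small pigeonhole argument that is worth spelling out. The following sketch writes $A_i^{(k)}$ for the averages formed with $T_k$; this superscript notation is ours.

```latex
Fix $n$. For each $k \ge n$ there is a pair $(i, j) \in [n, F(n)]^2$ with
$\| A_i^{(k)} f_k - A_j^{(k)} f_k \| \ge \varepsilon$. For each pair $(i, j)$, let
\[
  S_{i,j} = \{\, k \ge n \mid \| A_i^{(k)} f_k - A_j^{(k)} f_k \| \ge \varepsilon \,\} .
\]
The finitely many sets $S_{i,j}$ cover $\{ k \mid k \ge n \}$, which is cofinite and
hence in $D$, since $D$ is nonprincipal. So $S_{i,j} \in D$ for some pair $(i, j)$.
Because $T = (\prod_k T_k)_D$ acts coordinatewise, $A_i f$ is represented by the
sequence $(A_i^{(k)} f_k)_k$, and therefore
\[
  \| A_i f - A_j f \| = \lim_{k, D} \| A_i^{(k)} f_k - A_j^{(k)} f_k \| \ge \varepsilon .
\]
```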

The class $\mathcal{C}$ of all reflexive Banach spaces does not satisfy the hypothesis of
Theorem 3.2, which is to say, an ultraproduct of reflexive Banach spaces need
not be reflexive. However, there are interesting classes $\mathcal{C}$ to which the theorem
applies. For example, every uniformly convex Banach space is reflexive, and if one
fixes a modulus of uniform convexity, the class of uniformly convex spaces with
that modulus is closed under ultraproducts. Thus, Theorem 3.2 guarantees the
existence of a uniform bound on the rate of metastability that depends only on $\rho$,
$M$, $F$, and the modulus of uniform convexity.

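For another example, say that a Banach space $B$ is $J$-$(n, \varepsilon)$ convex if for every
$x_1, \ldots, x_n$ in the unit ball of $B$ there is a $j$, $1 \le j \le n$, such that

$$\Bigl\| \sum_{i < j} x_i - \sum_{i \ge j} x_i \Bigr\| \le n(1 - \varepsilon).$$
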
A space is $J$-convex if and only if it is $J$-$(n, \varepsilon)$ convex for some $n \ge 2$ and $\varepsilon > 0$.
Pisier [34] shows that a Banach space is $J$-convex if and only if it is super-reflexive,
so, in particular, every $J$-convex space is reflexive. Moreover, it is immediate from
the form of the definition that, for fixed $n \ge 2$ and $\varepsilon > 0$, the class of $J$-$(n, \varepsilon)$ convex
Banach spaces is closed under ultraproducts. Thus, Theorem 3.2 once again guarantees
the existence of a uniform bound on the rate of metastability that depends
only on $\rho$, $M$, $F$, $n$, and $\varepsilon$. Note that for $n = 2$, a space is $J$-$(n, \varepsilon)$ convex for some
$\varepsilon > 0$ if and only if it is uniformly non-square, a weakening of strict convexity due
to James [18].

The list of classes of structures to which Theorem 3.2 applies can easily be
extended. For example, we can obtain many classes of spaces that satisfy the
hypothesis of that theorem by simply fixing bounds on appropriate parameters in
the various characterizations of super-reflexivity given by Pisier in Chapter 3 of [34].
Other examples of classes of reflexive spaces that are closed under formation of
ultraproducts can be found in [3U] 123 13S].

In the case of a nonexpansive map on a uniformly convex Banach space, Avigad
and Rute [5] provide an explicit uniform bound on the number of $\varepsilon$-fluctuations of
the sequence $(A_n f)$, in terms of $\rho$ and the modulus of uniform convexity. In the
case of a nonexpansive map on a Hilbert space, Jones, Ostrovskii, and Rosenblatt
[19] provide an even stronger result, in the form of an explicit square-function
inequality for the sequence $(A_n f)$. We do not know the extent to which these
stronger uniformities extend. (Safarik and Kohlenbach [39] provide some general
conditions that guarantee that it is possible to compute a bound on the number of
$\varepsilon$-fluctuations.)

As noted in the introduction, the method of proving Theorem 3.2 is quite general,
applying to any class of structures and set of hypotheses that are preserved under
ultraproducts. (Restricting to elements $f$ with $\|f\| \le 1$ is needed to ensure that the
ultraproduct of such elements gives rise to an element of the ultraproduct.) In fact,
it is enough to know that some ultraproduct satisfies the relevant hypothesis,
allowing flexibility in the choice of ultrafilter. Rather than state all this formally,
we will illustrate with two additional examples, with respect to which notions of
metastability have been considered in the past.

For the first example, we consider extensions of the mean ergodic theorem to
"diagonal averages." Furstenberg's celebrated ergodic-theoretic proof of Szemerédi's
theorem involves averages of the form

$$\frac{1}{n} \sum_{i < n} T_1^{i} f_1 \cdots T_l^{i} f_l$$

where $T_1, \ldots, T_l$ are commuting measure-preserving transformations of a finite
measure space $(X, \mathcal{X}, \mu)$. Settling a longstanding open problem, Tao [42] showed that
such sequences always converge in the $L^2(X)$ norm. This result was recently
generalized by Walsh [46], as follows:

Theorem 3.3. Let $(X, \mathcal{X}, \mu)$ be a finite measure space with a measure-preserving
action of a nilpotent group $G$. Let $T_1, \ldots, T_l$ be elements of $G$, and let

$$(p_{i,j})_{i = 1, \ldots, l;\; j = 1, \ldots, d}$$

be a sequence of integer-valued polynomials on $\mathbb{Z}$. Then for any
$f_1, \ldots, f_d \in L^\infty(X, \mathcal{X}, \mu)$, the sequence of averages

$$\frac{1}{N} \sum_{n=1}^{N} \prod_{j=1}^{d} \bigl( T_1^{p_{1,j}(n)} \cdots T_l^{p_{l,j}(n)} \bigr) f_j$$

converges in the $L^2(X)$ norm.

When the relevant data $T$, $p$ are clear, it will be convenient to write $A_N(f)$ for
these averages. Once again, a compactness argument yields the following uniformity:

Theorem 3.4. For every $r$, $l$, $d$, $s$, $\rho > 0$, and function $F : \mathbb{N} \to \mathbb{N}$, there is a $K$
such that the following holds: given a nilpotent group $G$ of nilpotence class at most
$r$, elements $T_1, \ldots, T_l$ in $G$, a sequence $(p_{i,j})_{i=1,\ldots,l;\, j=1,\ldots,d}$ of integer-valued
polynomials on $\mathbb{Z}$ of degree at most $s$, a probability space $(X, \mathcal{X}, \mu)$, a
measure-preserving action of $G$ on $(X, \mathcal{X}, \mu)$, any sequence of elements
$f_1, \ldots, f_d \in L^\infty(X, \mathcal{X}, \mu)$, and any $\varepsilon > 0$, if $\|f_i\|_\infty / \varepsilon \le \rho$ for each $i$,
then there is an $n \le K$ such that $\|A_i(f) - A_j(f)\| < \varepsilon$ for every $i, j \in [n, F(n)]$.

As above, we can restrict attention to the case where $\|f_i\|_\infty \le 1$ in the statement
of the theorem, and without loss of generality we can assume that $G$ is generated
by $T_1, \ldots, T_l$. An ultraproduct construction due to Loeb [31], analogous to the
constructions described in Section 2, can be used to amalgamate a sequence of
measure spaces $(X_k, \mathcal{X}_k, \mu_k)$ to a measure space $(X, \mathcal{X}, \mu)$, and since first-order
properties of discrete structures are preserved under ultraproducts, the ultraproduct
of a sequence $(G_k)$ of groups of nilpotence class at most $r$ is again a group of
nilpotence class at most $r$. A measure-preserving action of each $G_k$ on $(X_k, \mathcal{X}_k, \mu_k)$
gives rise to a measure-preserving action of $G$ on $(X, \mathcal{X}, \mu)$, and the product of
the spaces $L^2(X_k, \mathcal{X}_k, \mu_k)$ embeds isometrically into the space $L^2(X, \mathcal{X}, \mu)$ (see, for
example, [HI Section 5]).

There is a catch, though: the ultraproduct of a sequence of polynomials $p_k$ with
coefficients in $\mathbb{Z}$ need not be a polynomial, since the coefficients can "go off to
infinity." One could rule that out by assuming that there is a uniform bound on
those coefficients, in which case the value $K$ in the statement of the theorem would
depend on that bound as well. As it turns out, however, in this particular case
there is a trick that eliminates the dependence on this parameter. Call a sequence
$(g_n)$ of elements of the form $g_n = T_1^{p_1(n)} \cdots T_l^{p_l(n)}$ a polynomial sequence.

Lemma 3.5. Let $G$ be a nilpotent group, and let $(g_n)$ be a polynomial sequence
of elements of $G$ as above. Then there is a nilpotent extension $\eta : \hat{G} \to G$ and
elements $\tau$ and $c$ such that for every $n$, $g_n = \eta(\tau^n c)$. Moreover, there is a bound
on the nilpotence class of $\hat{G}$ that depends only on bounds on the nilpotence class of
$G$, the number $l$ of polynomials, and a bound on their degrees.

Via $\eta$, the action of $G$ on $X$ lifts to an action of $\hat{G}$ on $X$, whereby the action of
$g_n$ lifts to the action of $\tau^n c$. Applying the lemma $d$ times, we can thus assume that
each polynomial sequence $g_{i,n} = T_1^{p_{1,i}(n)} \cdots T_l^{p_{l,i}(n)}$ appearing in the statement of
Walsh's theorem is of the form $\tau_i^n c_i$ for some $\tau_i$ and $c_i$ in $G$, at the expense of
increasing the nilpotence class of $G$.

Lemma 3.5 is a special case of a construction carried out by Leibman [29] in the
more general setting of an action of a Lie group, with both continuous and discrete
elements. We are grateful to Terence Tao for bringing this lemma to our attention,
and pointing out that it can be used to obtain a stronger uniformity in the statement
of Theorem 3.4. As Leibman points out, an instance of this trick was used by
Furstenberg [11, page 31]. Leibman's construction can be divided into two parts:
Proposition 3.14 of [29] shows how to define a nilpotent extension $\eta : \tilde{G} \to G$, a
unipotent automorphism $\tau$ of $\tilde{G}$, and an element $c$ of $\tilde{G}$, such that for every $n$,
$g(n) = \eta(\tau^n(c))$; and Proposition 3.9 shows that the extension $\hat{G}$ of $\tilde{G}$ by $\tau$ is again
a nilpotent group. Here, saying that $\tau$ is a unipotent automorphism means that
the mapping $\xi(a) = \tau(a) a^{-1}$ has the property that $\xi^q$ is identically equal to the
identity of $\tilde{G}$ for sufficiently large $q$. The proof of Proposition 3.9 gives an explicit
bound on how large $q$ has to



ULTRAPRODUCTS AND METASTABILITY 






be and on the nilpotence class of $\tilde{G}$; and Proposition 1 of Gruenberg [14] then provides
the requisite bound on the nilpotence class of $\hat{G}$.

With this lemma in hand, we can prove Theorem 3.4.

Proof. As above, we can restrict attention to the case where $\|f_i\|_\infty \le 1$ in the
statement of the theorem. Using Lemma 3.5, we can moreover assume $l = 2d$, $s = 1$,
and for every $i$, $p_{2i-1,i}(n) = n$, $p_{2i,i}(n) = 1$, and $p_{j,i} = 0$ for all other $j$, so that the
$i$th polynomial sequence is given by $g_{i,n} = T_{2i-1}^n T_{2i}$. Given $\rho > 0$ and $F$ such that
the claim is false, for each $k$ choose a probability space $(X_k, \mathcal{X}_k, \mu_k)$, a group $G_k$ of
nilpotence class at most $r$, and elements $T_{1,k}, \ldots, T_{l,k}$, and elements $f_{1,k}, \ldots, f_{d,k}$
with infinity norm at most 1 such that for every $n \le k$, $\|A_i(f_k) - A_j(f_k)\| \ge \varepsilon$ for
some $i, j \in [n, F(n)]$.

Fix a nonprincipal ultrafilter $D$ on $\mathbb{N}$. Let $(X, \mathcal{X}, \mu)$ be the result of applying the
Loeb construction to the sequence of spaces $(X_k, \mathcal{X}_k, \mu_k)$, let $G$ be the ultraproduct
of the sequence $(G_k)$ with respect to $D$, and for each $i$, let $T_i = (\prod_k T_{i,k})_D$.
Then $G$ has nilpotence class at most $r$, and each $T_i$ is a measure-preserving
transformation of $X$. But then, as in the proof of Theorem 3.2, the elements
$f_1 = (\prod_k f_{1,k})_D, \ldots, f_d = (\prod_k f_{d,k})_D$ yield a counterexample to Theorem 3.3. □

Tao [40] shows that one can alternatively formulate Walsh's theorem in algebraic
terms, which allows one to avoid the reference to the Loeb construction in the proof
of Theorem 3.4. In fact, both Walsh's original proof [46] and Tao's later proof of
Walsh's result [30] establish Theorem 3.4 directly. Tao's proof of his prior result
[42] also established the corresponding uniformity, but there are now other proofs
of that theorem that do not [2] [TT1 [44]. Tao [40] emphasizes that Theorem 3.4 is
stronger than Theorem 3.3; the observation here is that they are essentially the
same, modulo compactness and Lemma 3.5.

We consider a final example, this time from nonlinear ergodic theory. Fix a
Hilbert space $\mathcal{H}$. Let $C$ be a bounded, closed, convex subset of $\mathcal{H}$, and let $T$ be a
nonexpansive map from $C$ to $C$. Let $(\lambda_n)$ be a sequence of elements of $[0, 1]$, and
let $f$ and $u$ be any elements of $C$. The Halpern iteration corresponding to $T$, $(\lambda_n)$,
$f$, and $u$ is the sequence given by

$$f_0 = f, \qquad f_{n+1} = \lambda_{n+1} u + (1 - \lambda_{n+1}) T f_n.$$

If $T$ is linear, $u = f$, and $\lambda_n = 1/(n + 1)$, then $(f_n)$ is the familiar sequence
of ergodic averages, with $f_n = A_{n+1} f$. Wittmann [47] showed that, assuming the set of
fixed points of $T$ is nonempty, the following conditions on the sequence $(\lambda_n)$ suffice
to ensure that the sequence $(f_n)$ of Halpern iterates converges to the projection of $u$
onto the set of fixed points:

• $\lim_{n \to \infty} \lambda_n = 0$

• $\sum_{n=1}^{\infty} |\lambda_{n+1} - \lambda_n|$ converges

• $\sum_{n=1}^{\infty} \lambda_n = \infty$.

In particular, these are satisfied when $\lambda_n = 1/(n + 1)$.
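The remark that Halpern iterates with $u = f$ and $\lambda_n = 1/(n+1)$ reduce to ergodic averages can be checked numerically. The following Python sketch does this for a rotation of the complex plane, which is a hypothetical choice of $T$; the function names are ours. With $A_N f = \frac{1}{N} \sum_{n<N} T^n f$, induction on the recurrence gives $f_n = A_{n+1} f$, and the code compares the two sides.

```python
import cmath

def halpern_iterates(T, lambdas, f, u, steps):
    """Return [f_0, ..., f_steps], where f_0 = f and
    f_{n+1} = lambdas(n+1) * u + (1 - lambdas(n+1)) * T(f_n)."""
    iterates = [f]
    for n in range(steps):
        lam = lambdas(n + 1)
        iterates.append(lam * u + (1 - lam) * T(iterates[-1]))
    return iterates

def ergodic_average(T, f, N):
    """Return A_N f = (1/N) * sum_{n<N} T^n f."""
    total, current = 0j, f
    for _ in range(N):
        total += current        # add T^n f
        current = T(current)    # advance to T^(n+1) f
    return total / N

rotation = cmath.exp(1j)        # hypothetical linear isometry of C
T = lambda z: rotation * z
f = 1 + 0j
iterates = halpern_iterates(T, lambda n: 1 / (n + 1), f, f, steps=30)

# Largest discrepancy between f_n and A_{n+1} f over n = 0, ..., 30.
max_gap = max(abs(iterates[n] - ergodic_average(T, f, n + 1)) for n in range(31))
print(max_gap)
```

The discrepancy should be at the level of floating-point rounding, reflecting the identity $f_n = A_{n+1} f$ rather than any approximation.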

The linear structure of $\mathcal{H}$ only comes into play in the assumption that $C$ is
convex. Saejung [38] has generalized Wittmann's result to CAT(0) spaces, which
are metric spaces with an abstract notion of "linear combination," that is, metric
spaces equipped with a function $W(x, y, \lambda)$ which, intuitively, plays the role of
$(1 - \lambda) x + \lambda y$. The specific axioms that $W$ is assumed to satisfy can be found
in EH [38]; we only need the fact, established in [9, pages 77-78], that the
ultraproduct of CAT(0) spaces is again a CAT(0) space. Saejung's theorem states
the following:

Theorem 3.6. Let $C$ be a closed convex subspace of a complete CAT(0) space,
and let $T : C \to C$ be a nonexpansive map such that the set of fixed points of $T$ is
nonempty. Suppose $(\lambda_n)$ satisfies the three conditions above. Then for any $u$, $f$ in
$C$, the sequence of Halpern iterates $(f_n)$ converges to the projection of $u$ onto the
set of fixed points of $T$.

Kohlenbach and Leuştean [25] have shown that there is a uniform bound on
the rate of metastability, given by a primitive recursive functional. If one is only
interested in uniformity and not the particular rate, the following provides a quick
proof:

Theorem 3.7. Fix $(\lambda_n)$ satisfying the three conditions above. For every $\varepsilon > 0$, $M$,
and function $F : \mathbb{N} \to \mathbb{N}$, there is a $K$ such that the following holds: given a CAT(0)
space $(X, d, W)$, a closed convex subset $C$ of $X$ with diameter at most $M$, a nonexpansive
map $T : C \to C$ with a fixed point in $C$, and $f, u$ in $C$, if $(f_n)$ denotes the sequence
of Halpern iterates, then there is an $n \le K$ such that $d(f_i, f_j) < \varepsilon$ for every $i, j$ in
$[n, F(n)]$.

Proof. Once again, use an ultraproduct construction to amalgamate a sequence 
of purported counterexamples. We have already noted that the ultraproduct of
CAT(0) spaces is again a CAT(0) space. The uniform bound on the diameter of
each of the sets C is also a bound on the diameter of their product. The fact that 
convexity is preserved is immediate; and it is well known that an ultraproduct of 
closed sets is again closed (see, for example, Proposition 5.3]). □ 

Theorem 3.7 can also be seen as a consequence of Corollary 4.25 in Gerhardy
and Kohlenbach [12], modulo verification of the fact that Saejung's theorem can
be derived in the formal axiomatic system mentioned there. As Gerhardy and
Kohlenbach note, one can weaken the hypothesis that $T$ has a fixed point in $C$ to
the hypothesis that $T$ has an $\varepsilon$-fixed point in $C$ for every $\varepsilon > 0$. This is easy to
see from the ultraproduct argument as well, since the ultralimit of $\varepsilon$-fixed points for
a sequence of $\varepsilon$ decreasing to 0 is an actual fixed point. This fact is commonly used
in applications of ultraproducts to fixed-point theory; see, for example, Aksoy and
Khamsi [1].
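
The weakening to $\varepsilon$-fixed points amounts to a one-line computation in the ultraproduct. In the sketch below, the names $x_k$ and $\varepsilon_k$ are ours, and $d$ denotes the metric on the ultraproduct.

```latex
Suppose $x_k \in C_k$ satisfies $d_k(T_k x_k, x_k) \le \varepsilon_k$, where $\varepsilon_k \to 0$.
Let $x$ be the element of the ultraproduct represented by $(x_k)$, and let
$T = (\prod_k T_k)_D$. Since $D$ is nonprincipal, $\lim_{k, D} \varepsilon_k = 0$, and so
\[
  d(T x, x) = \lim_{k, D} d_k(T_k x_k, x_k) \le \lim_{k, D} \varepsilon_k = 0 .
\]
Hence $T x = x$.
```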